Abstract

An algorithm is presented for the consistent recovery of replicated
data in a client-server system. The algorithm is based on logging
and is similar to the optimistic techniques that are well known in the
literature. However, unlike in existing optimistic techniques, explicit
dependency information is not maintained. Instead, dependency
information is estimated from the ordering of messages found in servers’
logs. These dependency estimates can, in general, be expensive to
compute. It is therefore shown how inexpensive estimates can be
applied when a system is well structured.


1  Introduction 

Object  oriented  distributed  systems  are  becoming  increasingly  common. 
These  systems  provide  users  with  tools  for  building  abstract  data  objects. 
Such an object generally consists of routines for maintaining it along with
an interface by which clients access it. Only the interface of an object is
visible to a client; implementation details, such as replication and failure
recovery, are hidden within the object module. Figure 1 depicts such a
system.

Figure 1: A portion of a distributed operating system. Depicted are two
objects representing a name server and a resource allocation
manager. Clients or processes in the system operate by first registering
themselves with the naming service and then allocating resources
under that name.

Failure  recovery  in  these  systems  is  often  accomplished  through  the  use 
of logging. By writing to a log file the sequence of updates that occurs
to an object, the object’s state can be reconstructed after a failure.
However, because the states of objects may be related, consistency problems
potentially arise if object logs are not coordinated. For example, in the
system of figure 1 the state of the resource manager is dependent on the
state of the name server; only registered clients may allocate resources.
Suppose  a  failure  causes  a  client  registration  record  to  be  lost  (not  logged). 
If  resource  allocations  are  logged  for  this  client,  then  the  system  may  later 
recover  into  a  state  that  reflects  the  client’s  allocations  without  reflecting 
its  registration. 

Transactions can be used to enforce consistency between logs. For
example, the registration of a client name and its allocation of resources could
be grouped into a single transaction and committed as a unit, in order to
ensure that they are recovered atomically. However, many applications
do not require the full power of atomicity that transactions provide.
Often a weaker form of consistency, such as causal consistency, is sufficient
to guarantee correctness [Lam78,BJ87a]. In loosely coupled systems such
as ISIS [BJ87a], this weakening of consistency usually leads to improved
performance and availability.

This paper presents a log-based mechanism for the causally consistent
recovery of replicated data objects. The problem of representing and
maintaining causal dependency information about the updates on objects is
not a simple one. Solutions to this problem have been devised for many
different settings, including inter-process communication [BJ87b,PBS88],
highly available distributed services [LL86], and optimistic failure recovery
[SY85,JZ88]. Dependency information in these systems is maintained
explicitly: each object update is tagged with either an enumeration of the
updates on which it is dependent or with a timestamp that reflects the
update’s causal ordering.

Unfortunately, it can be difficult or impossible to maintain explicit
dependency information about updates when the set of object clients is
either unknown or large and dynamically changing. The recovery algorithm
presented in this paper avoids the need to maintain explicit dependency
information by estimating such information from the ordering of updates
in object servers’ logs. When an object server first recovers from a failure,
it approximates the set of dependencies in the system from ordering
information available in the logs of servers. This information is then used to
ensure that only consistent object states are recovered.

The presentation of the algorithm is divided into two parts. First, in
sections 2 through 6, a recovery algorithm is derived based on explicit
knowledge of the dependencies between object updates. In section 2, the
formal system model is presented and in section 3 the notions of
consistency and correctness are defined. Based on these definitions, section 4
outlines several consistency problems that arise through the use of logging
and presents a basic sketch of the recovery mechanism. The actual
implementation of the recovery algorithm is built on functions for consistently
adding and deleting entries from logs. These functions are presented in
section 5 and used in section 6 to describe the recovery algorithm.

The  second  part  of  the  presentation  discusses  methods  for  estimating 
dependency information from the ordering of updates in object logs.
Section 7 presents several dependency estimates and describes how they can be
used  in  the  recovery  algorithm  in  place  of  the  values  they  approximate.  In 
general,  the  estimates  used  can  be  expensive  to  compute.  Because  of  this, 
section  8  describes  a  special  class  of  systems  in  which  inexpensive  estimates 
can  be  used  by  the  algorithm. 

The material presented in this paper is a summary of that in [Kan89].
Much of the formalism and all of the proofs have been omitted for
brevity.

2  System  Model 

2.1  Partial  Replication 

Our system model is a partially replicated variation of the client-server model
of computation. A set of servers, denoted SERV, is used to maintain
replicas of a set of data objects. Each server maintains replicas of several
different objects. Data objects are not fully replicated: each object is
replicated at only some of the servers. We let D denote the set of all data
objects in the system and let $\mathit{SERV}_A$ denote the subset of servers managing
a replica of object A ($A \in D$).

Objects are accessed by a set of clients that may or may not differ from
the set of servers. In order to access an object, $A \in D$, a client broadcasts
a request to all servers of A, that is, to all members of $\mathit{SERV}_A$. Upon
receiving such a request, each server of object A performs the requested
operation on its local copy of A.

We make no assumption about the relative ordering of client requests.
Servers may receive the same requests in differing orders, provided that the
orders are mutually consistent and correct with respect to the application being
implemented. It is the responsibility of clients to ensure that such correct
orderings are perceived by the servers. To this end, clients may use a
variety of broadcast mechanisms, each differing in the ordering properties
it provides.

Figure 2: An example of a partially replicated printer service. In figure 2(a),
both system clients are broadcasting job submissions. In
figure 2(b), the job submissions have completed with the second
client broadcasting a notification of the completion of the first
job.

As an example, consider figure 2. This figure depicts two states in
the execution of a system maintaining information about a printer service.
The system consists of three servers (f, g, and h) replicating two data
objects (Jobs and Comps). The object Jobs is a list of jobs that have
been submitted for printing and is replicated at servers f and g. The
object Comps is a list of completed jobs and is replicated at servers g
and h. Figure 2(a) depicts a state of the system in which two clients
(client 1 and client 2) are submitting jobs for printing. Note that the job
submissions will be received in different orders by the two servers of object
Jobs. Figure 2(b) depicts a later state of the system, after both
job submission broadcasts ($job_1$ and $job_2$) have completed. In this state,
client 2 is in the process of broadcasting a completion notification for $job_1$.
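
The printer-service configuration above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SERV` dictionary, the `broadcast` function, and the per-server inbox lists are names introduced here, not part of the paper's model, and message delivery is simulated by appending to lists.

```python
from collections import defaultdict

# Assumed placement table: object name -> set of servers holding a replica.
# The values follow the printer example (Jobs at f and g, Comps at g and h).
SERV = {
    "Jobs": {"f", "g"},
    "Comps": {"g", "h"},
}


def broadcast(obj, request, inboxes):
    """Deliver `request` to every server that replicates `obj`."""
    for server in sorted(SERV[obj]):
        inboxes[server].append((obj, request))


# Hypothetical run: one job submission and one completion notification.
inboxes = defaultdict(list)
broadcast("Jobs", "job1", inboxes)
broadcast("Comps", "comp1", inboxes)
```

In this run server g receives both requests, while f and h each receive only the request for the object they replicate, which is the essence of partial replication.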

2.2  Logging 

In  order  to  support  recovery  from  failures,  each  server  maintains  a  log  of 
the  requests  that  it  receives. 

Definition 2.1 A log is a totally ordered set $(L, \rightarrow_L)$ of requests.

Here, L is the set of requests received by a server and $\rightarrow_L$ is the order
in  which  those  requests  were  received.  Only  update  requests  are  actually 
logged.  Read  only  requests  are  omitted  because  they  do  not  affect  an 
object’s  state.  Note  that  because  servers  may  receive  requests  in  different 
orders,  they  may  also  log  requests  in  different  orders. 

Definition 2.2 The projection of a log, $(L, \rightarrow_L)$, onto an object, $A \in D$, is
the set of object A requests in the log. Formally,

\[ (L, \rightarrow_L)|_A = \{\, x \in L \mid x \text{ is a request on object } A \,\} \]
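
As a minimal sketch of Definition 2.2, a log can be modelled as a Python list of `(object, request)` pairs kept in the order of receipt. That representation, the `project` function, and the sample log `log_g` are assumptions made for illustration only.

```python
def project(log, obj):
    """Return the set of requests in `log` that are addressed to `obj`."""
    return {name for (target, name) in log if target == obj}


# Hypothetical log of a server that replicates both Jobs and Comps.
log_g = [("Jobs", "job2"), ("Jobs", "job1"), ("Comps", "comp1")]
```

Because the projection is a set, two logs that hold the same requests for an object in different orders have equal projections onto that object.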

In  order  to  decouple  the  execution  speed  of  servers  from  the  speed  of 
logs,  servers  maintain  their  logs  asynchronously.  No  coordination  occurs 
between  the  logs  of  different  servers.  In  addition,  no  coordination  occurs 
between  the  state  of  a  server  and  its  log.  The  state  of  a  log  may  often 
lag  behind  the  state  of  its  server.  (This  approach  is  orthogonal  to  that 
of  write-ahead  logging  where  the  state  of  an  object  and  its  log  are  always 
synchronized  [BHG87].) 

2.3  Failures 

Servers fail by crashing [SS83]. When a server crashes, it immediately ceases
to receive, process, and log client requests. We will not address the problem
of server partitions [DGMS85]. When a server is functioning, we assume
that it can communicate with all other functioning servers.

Figure 3: One possible execution of the printer service of figure 2. Depicted
are two job submission broadcasts ($job_1$ and $job_2$) along with one
job completion notification ($comp_1$). Also depicted are two
failures. Server f fails at time $t_1$ and server g fails at time $t_2$. In the
diagram, horizontal lines represent process (server or client)
executions while diagonal arrows represent request message
broadcasts. Dotted lines represent the logging of request messages by
servers. The length of a dotted line indicates the latency between
the receipt of a request and its physical logging.

Figure 3 depicts a possible execution of the printer service shown in
figure 2. In the example, server f fails at time $t_1$ after receiving (and logging)
$job_1$, but before receiving $job_2$. Server g fails at time $t_2$ after receiving (and
logging) all three requests. Server h functions continuously throughout the
example, receiving and logging the job completion notification $comp_1$. Note
that the final logs of servers f and g do not agree on the state of object
Jobs. Not only do they contain different requests for the object, but they
reflect different orders on those requests.

2.4  System  State 

At  the  time  a  server  recovers,  the  objects  in  the  system  can  be  divided  into 
two  categories.  An  active  object  is  one  for  which  some  server  is  actively 
managing a replica. An inactive object is one for which all servers of the
object  have  failed  or  are  in  the  process  of  restoring  their  replicas. 

For  the  purpose  of  recovery,  the  state  of  the  system  can  be  summarized 
in  the  following  manner: 

Definition 2.3 A state of the system is characterized by the following
values:

For each data object, $A \in D$:

$\mathit{ACT}_A$   The set of servers actively managing a replica of object A.

$\mathit{REC}_A$   The set of servers in the process of restoring their replicas of object A.

$\mathit{FAL}_A$   The set of failed servers of object A.

For each server, $f \in \mathit{SERV}$:

$(L_f, \rightarrow_f)$   The log of server f.


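$\mathit{ACT}_{Jobs} = \{g\}$   $\mathit{REC}_{Jobs} = \emptyset$   $\mathit{FAL}_{Jobs} = \{f\}$

$\mathit{ACT}_{Comps} = \{g, h\}$   $\mathit{REC}_{Comps} = \emptyset$   $\mathit{FAL}_{Comps} = \emptyset$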

(■t/j  — 

Figure 4: A state from the printer service execution given in figure 3.
Depicted is the state of the system immediately after time $t_1$.

As an example, consider again the execution of figure 3. Figure 4 shows
the state of this system immediately after time $t_1$.
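
The per-object part of Definition 2.3 can be sketched as a small record type. The class name `ObjectStatus`, its field names, and the `is_active` helper are assumptions introduced for illustration; the sample values are the server sets of Figure 4 as they can be read from the text (f failed, g and h functioning).

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStatus:
    """Server sets recorded for one data object."""
    act: set = field(default_factory=set)  # servers actively managing a replica
    rec: set = field(default_factory=set)  # servers restoring their replica
    fal: set = field(default_factory=set)  # failed servers of the object

    def is_active(self):
        # An object is active when at least one server actively manages it.
        return bool(self.act)


# Assumed reading of the state shown in Figure 4.
status = {
    "Jobs": ObjectStatus(act={"g"}, fal={"f"}),
    "Comps": ObjectStatus(act={"g", "h"}),
}
```

Both objects are active in this state. An object whose servers are all failed or recovering, such as `ObjectStatus(rec={"f"}, fal={"g"})`, would be inactive.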

3  Causal  Dependencies 

During the execution of a system, clients can interact with one another.¹
These interactions often lead to data dependencies between the requests the
clients issue. For example, in figure 2 the job completion notification $comp_1$
is causally dependent² on the job submission $job_1$: a job cannot complete
until after it has been submitted.

Causal  dependencies  restrict  the  set  of  correct  request  orderings  that 
can  be  perceived  by  servers.  A  server  should  never  receive  two  causally 
related  requests  out  of  causal  order.  Earlier  it  was  stated  that  servers  may 
receive  requests  in  differing  orders,  provided  that  those  orders  are  correct 
for  the  application.  This  can  be  stated  more  precisely  by  saying  that  servers 
may  receive  requests  in  any  order  consistent  with  causality. 

¹ Clients can interact either directly, by sending messages to one another, or indirectly,
through the objects managed by the servers.

² Many types of dependencies can exist between client requests. In this paper, however,
we will focus on causal dependencies.


Request System: $(R, \prec_R)$

$R = \{job_1, job_2, comp_1\}$
$job_1 \prec_R comp_1$


Figure 5: A request system representing the dependencies in the printer
service. The system consists of three requests: two job submissions
and a job completion notification. The only causal dependency in
the system is the one between the completion notification of $job_1$
and its submission.


3.1  Request  Systems 

The  causal  dependency  structure  of  an  application  can  be  summarized  by 
means  of  a  request  system. 

Definition 3.1 A request system is a partially ordered set $(R, \prec_R)$ of
requests.

Here, R is the set of all requests made by clients in the system and $\prec_R$ is
a partial order that relates all pairs of causally dependent requests. The
partial order $\prec_R$ may be interpreted as meaning that if two requests are
related, $x \prec_R y$, then request y is causally dependent on request x (i.e.,
request y must follow request x). The relation $\prec_R$ is equivalent to the
“happens before” relation of Lamport [Lam78]. Like the “happens before”
relation, $\prec_R$ is transitive. We will sometimes use the notation x.A in order
to refer to a client request made on object A.

Figure 5 shows the request system for the example given in figure 2.
Note that causal dependencies hold between requests made by different
clients as well as between requests made on different objects: request $comp_1$
is dependent on request $job_1$, even though the latter is made by client 1
on object Jobs while the former is made by a different client on a different
object.
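
A request system can be sketched as a set of requests together with a set of ordered pairs. In the sketch below, the pair `(x, y)` is read as "x precedes y"; the names `transitive_closure`, `PRECEDES`, and `depends_on` are assumptions for illustration, and the closure step simply makes the transitivity of the relation explicit.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (x, y) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# The request system of the printer example: one direct dependency.
R = {"job1", "job2", "comp1"}
PRECEDES = transitive_closure({("job1", "comp1")})


def depends_on(y, x):
    """True if request y is causally dependent on request x."""
    return (x, y) in PRECEDES
```

Here `depends_on("comp1", "job1")` holds, while `job2` is unrelated to either of the other two requests.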


It is the responsibility of clients to enforce any request ordering
constraints that must hold between their requests. Servers simply process and
log requests in the order in which they are received. One possibility is for
clients to use reliable ordered broadcasts [BJ87b,CM84,CASD86] to ensure
the proper ordering between requests.

3.2  Dependencies  and  server  logs 

Because servers log requests in the order they are received, causal
dependency constraints also apply to logs. That is, the ordering of requests in
a server’s log should always be consistent with the request ordering
constraints of the application. This observation can be formalized as follows:

Definition 3.2 A log, $(L_f, \rightarrow_f)$, for server f is causally consistent with
respect to a request system, $(R, \prec_R)$, if

\[ \forall\, y.B \in L_f : \forall\, x.A \in R \;(x \prec_R y) : (f \in \mathit{SERV}_A) \implies (x.A \in L_f \,\wedge\, x \rightarrow_f y) \]
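
Definition 3.2 can be checked mechanically when the dependencies are known explicitly. The following sketch assumes a log is a list of `(object, request)` pairs in order of receipt and that the precedence relation is given as a transitively closed set of pairs; the function name and all parameter names are hypothetical.

```python
def is_causally_consistent(server, log, requests, precedes, serv):
    """Check one server's log against a request system.

    log      -- list of (object, request) pairs in the order received
    requests -- dict mapping every request in R to the object it is made on
    precedes -- set of (x, y) pairs meaning x precedes y (transitively closed)
    serv     -- dict mapping each object to the set of servers replicating it
    """
    position = {name: i for i, (_, name) in enumerate(log)}
    for y in position:
        for x, obj in requests.items():
            if (x, y) in precedes and server in serv[obj]:
                # x must be logged by this server, and logged before y.
                if x not in position or position[x] > position[y]:
                    return False
    return True


# Printer example data (assumed encoding).
requests = {"job1": "Jobs", "job2": "Jobs", "comp1": "Comps"}
precedes = {("job1", "comp1")}
serv = {"Jobs": {"f", "g"}, "Comps": {"g", "h"}}
```

Note that server h may log `comp1` without `job1`, since h does not replicate Jobs, whereas the same log at server g would violate the definition.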

3.3  Dependencies  and  recovery 

Causal dependencies also affect the issue of consistency between the states
of different objects. The state of a system should never reflect a request
unless all of the causal dependents of that request are also reflected in the
state of the system. For example, the printer service should not reflect the
completion ($comp_1$) of the first job unless it reflects the job’s submission
($job_1$). Ensuring this type of consistency is the problem at the heart of object
replica recovery. The problem is analogous to the problem of generating
checkpoints along a consistent cut.
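
The consistency condition on system states can be sketched as a search for violated dependencies. The function below assumes the set of requests reflected in the system state and the precedence pairs are both known explicitly; `missing_dependencies` is a hypothetical name.

```python
def missing_dependencies(reflected, precedes):
    """Return the pairs (x, y) where y is reflected in the state
    but x, on which y causally depends, is not."""
    return {(x, y) for (x, y) in precedes if y in reflected and x not in reflected}


# Printer example: comp1 depends on job1.
precedes = {("job1", "comp1")}
```

A state reflecting only `comp1` yields the violation `("job1", "comp1")`, while a state reflecting both requests, or only `job2`, yields none.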

3.4  Maintaining  dependency  information 

Many methods exist for maintaining causal dependency information about
the updates in a system. One method is to tag each update with a list of
identifiers of its dependents; this is the approach taken in Psync [PBS88].
A similar method, and the one used in ISIS [BJ87b], is to piggyback each
update message with a copy of each of its dependents. Another method is
to tag each update with a timestamp that reflects the update’s causal
ordering with respect to other updates; this approach is used in both the highly
available services [LL86] and optimistic failure recovery [SY85,JZ88]. Each
of these examples illustrates a method based on maintaining dependency
information explicitly. Unfortunately, it can be difficult or impossible to
maintain explicit dependency information when the set of clients is either
unknown or large and dynamically changing. In this paper we examine an
alternate approach based on maintaining dependency information
implicitly. In particular, dependency information is estimated from the ordering
of updates in servers’ logs.

4  Failure  Recovery 

Servers  use  their  logs  to  recover  from  failures  in  the  usual  way.  In  order  to 
reconstruct  the  state  of  a  failed  object  replica,  a  recovering  server  simply 
re-executes  the  sequence  of  requests  logged  for  that  object.  Once  the  state 
of  the  object  replica  is  restored,  the  recovering  server  begins  receiving  and 
processing  new  requests  for  it.  Several  synchronization  problems  potentially 
arise,  however,  if  the  states  recovered  by  servers  are  not  coordinated. 

4.1 Synchronization Problems

Because  of  request  dependencies  and  uncoordinated  logs,  a  failed  server 
can  recover  its  replica  of  an  object  in  a  state  that  is  inconsistent  with  the 
states  of  other  object  replicas  in  the  system.  We  present  three  examples  to 
illustrate  how  such  inconsistencies  can  occur. 

One type of inconsistency can occur when a failed server recovers its
replica of an object that is already active in the system. The state of the
replica recovered by the failed server will be the state of the object from
the time of the server’s failure. Since the time of this failure, however, the
object has probably undergone changes that will be reflected in the states of
the active replicas. The state recovered by the failed server will, therefore,
likely disagree with the active replicas. This problem can be illustrated in
figure 3. Suppose server f recovers between time $t_1$ and time $t_2$. The state
it recovers for object Jobs (the state represented in its log) does not reflect
the submission of $job_2$. Server f will therefore disagree with server g on the
set of submitted jobs.

This  problem  can  be  easily  solved  by  transferring  the  state  of  the  active 
object  replicas  to  the  failed  server  at  the  time  of  its  recovery.  The  recovering 
server  could  then  ignore  its  log  and  use  the  transferred  state  to  initialize 
its  object  replica.  This  is  the  approach  used  by  ISIS  [BJ87a]  and  will  be 
the  approach  taken  here.  We  refer  to  the  problem  of  initializing  replicas 
of  active  objects  as  the  JOIN  problem.  A  more  formal  discussion  of  this 
problem  is  given  below. 

A similar type of inconsistency can occur when several failed servers all
simultaneously attempt to recover their replicas of the same inactive object.
Because each server probably failed at a different time, each server’s log
probably reflects a different state for the object. It is therefore likely that
each server will recover its replica in a state that disagrees with the states
recovered by the other servers. This problem can also be illustrated in
figure 3. Suppose both server f and server g recover after time $t_2$. In this
case, server g will recover submission $job_2$ while server f will not.

In  order  to  solve  this  problem,  the  recovering  servers  must  cooperate 
and  agree  on  a  state  for  the  inactive  object.  Ideally,  this  state  should  be  as 
recent  as  possible.  In  synchronous  systems,  where  the  states  of  replicas  are 
coordinated,  the  most  recent  logged  state  is  that  of  the  last  server  to  fail 
[Ske85].  However,  in  asynchronous  systems,  this  is  not  true.  Any  server 
may  have  potentially  logged  the  most  recent  state.  It  is  even  possible  that 
different  servers  may  have  logged  different  requests.  In  this  case,  none  of 
the  logged  states  is  the  most  recent.  Each  contains  some  requests  that 
are  not  present  in  the  other  logs.  A  fairly  recent  state  can  generally  be 
constructed,  though,  by  merging  the  logs  of  all  recovering  servers. 
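
The idea of merging logs can be sketched naively as follows. This is not the merge function developed later in the paper: the ordering rule used here (a request keeps the position of its first appearance) is an assumption chosen for simplicity, and nothing in it guarantees causal consistency of the result.

```python
def merge_logs(logs):
    """Merge several logs into one list that contains each request once.

    Requests keep the order of the first log in which they appear; this
    ordering choice is illustrative only.
    """
    merged, seen = [], set()
    for log in logs:
        for entry in log:
            if entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return merged


# Hypothetical logs of two recovering servers of object Jobs.
log_f = [("Jobs", "job1")]
log_g = [("Jobs", "job2"), ("Jobs", "job1")]
```

Merging `log_f` and `log_g` yields a state containing both job submissions, which is more recent than either log's state would be if one log held requests the other lacked.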
Both of the above examples illustrate synchronization problems that
are rooted in the asynchrony of logging and failures. A more difficult
synchronization problem arises from the presence of request dependencies. As
shown above, an object replica can be recovered in a variety of states,
depending on who is recovering the replica and at what time. Because of
this, it is possible that replicas of two different objects can be recovered in
causally inconsistent states. That is, one object can be recovered in a state
that contains a request for which causal dependents (on other objects) were
not recovered.

For example, consider figure 6. This figure shows an execution of the
printer service similar to the one given earlier. However, unlike in the earlier
execution, server f fails before logging any request, and server h fails at
time $t_2$ (in addition to server g). If servers f and h were both to recover
(after time $t_2$) before server g, the system would be in a causally inconsistent
state. That is, the system state would reflect $comp_1$, the completion of $job_1$,
without reflecting the submission of $job_1$.

4.2 Synchronization Phase

In order to solve these problems, the log of a recovering server can be
synchronized with the logs (states) of the other servers in the system at the
time of recovery. The recovering server’s log can be synchronized with the
logs of active servers (on the states of active objects) as well as with the
logs of other recovering servers (on the states of inactive objects).

We  divide  the  recovery  sequence  of  a  failed  server  into  two  parts:  a 
JOIN part and an ACTIVATE part. The JOIN part addresses the problem
of synchronizing the recovering server’s log with the states of active
objects.  The  ACTIVATE  part  addresses  the  problem  of  synchronizing  the 
recovering  server’s  log  with  the  logs  of  other  recovering  servers  on  the  states 
of  inactive  objects.  Figure  7  illustrates  the  relationship  of  these  two  parts 
in  the  recovery  sequence. 

Figure 6: An example of causally inconsistent recovery. If servers f and h
were both to recover after time $t_2$, they would recover in mutually
inconsistent states (server h would reflect the completion, $comp_1$,
of $job_1$ while server f would fail to reflect the job’s submission).
Note the dotted box near the time line of server f. This box
shows the point at which the job submission $job_1$ would have
been logged by server f, had that server remained functioning.

The JOIN and ACTIVATE problems are formally described below and
their solutions are presented in the following sections. In order to simplify
the discussion we will assume that, at the time of a server recovery, all active
servers of an object have received and logged the same set of requests for
that object (although possibly in different orders). We will refer to this set
of requests as the active state of the object. Formally, the active state of
object A is

\[ \mathit{AS}_A = \bigcup_{f \in \mathit{ACT}_A} (L_f, \rightarrow_f)|_A \]

This  assumption  on  the  states  of  active  servers  may  appear  to  violate  the 
statement  that  servers  receive  and  log  requests  asynchronously.  However, 
the  assumption  only  applies  to  the  active  servers  of  an  object  and  only 
at  times  of  a  server  recovery.  It  should  be  pointed  out  that  enforcing  the 
assumption  is  relatively  easy.  The  details  can  be  found  in  [Kan89]. 

JOIN  Problem 

When a server, $f \in \mathit{SERV}$, first recovers from a failure, its log is brought
into agreement with the states of the active objects³ in the system. The
recovering server’s old log, $(L_f, \rightarrow_f)$, is altered to create a new log, $(L'_f, \rightarrow'_f)$,
that agrees with the logs of active servers on the states of active objects.
Formally, a new log for server f is generated with the following properties:

• The new log is causally consistent.

• The new log is in agreement with the states of objects that are active.
That is,

\[ \forall\, A \in D \;(\mathit{ACT}_A \neq \emptyset) : f \in \mathit{SERV}_A \implies (L'_f, \rightarrow'_f)|_A = \mathit{AS}_A \]
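
The agreement property of the JOIN problem can be sketched as a check on a candidate new log. The function name, the dictionary encoding of active states, and the sample logs are assumptions made for illustration; objects absent from `active_states` are treated as inactive.

```python
def agrees_with_active_states(server, new_log, active_states, serv):
    """Check that `new_log` projects onto the active state of every
    active object that `server` replicates.

    new_log       -- list of (object, request) pairs
    active_states -- dict mapping each active object to its set of requests
    serv          -- dict mapping each object to the servers replicating it
    """
    for obj, state in active_states.items():
        if server in serv[obj]:
            projection = {name for (target, name) in new_log if target == obj}
            if projection != state:
                return False
    return True


serv = {"Jobs": {"f", "g"}, "Comps": {"g", "h"}}
# Hypothetical situation: both objects are active when server g recovers.
active_states = {"Jobs": {"job1"}, "Comps": {"comp1"}}
old_log_g = [("Jobs", "job1"), ("Jobs", "job2"), ("Comps", "comp1")]
new_log_g = [("Jobs", "job1"), ("Comps", "comp1")]
```

The old log of g fails the check because it contains `job2`, which is not part of the active state of Jobs; the log with `job2` deleted passes.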

ACTIVATE  Problem 

Once the log of a recovering server has been synchronized with the states
of the active objects in the system, it is synchronized with the logs of the
other recovering servers in the system on the states of inactive objects. This
synchronization is done one inactive object at a time.

³ An object $A \in D$ is active if $\mathit{ACT}_A \neq \emptyset$.

Let A denote an inactive object that is being recovered. All of the
recovering servers of object A (all of the members of $\mathit{REC}_A$) participate
in the recovery of object A. A new state is chosen for object A that is
consistent with the states of the active objects in the system. Each of the
recovering servers then installs this state in its log as the state of object A.
More precisely, the old logs of the recovering servers

\[ \{\, (L_f, \rightarrow_f) \mid f \in \mathit{REC}_A \,\} \]

are altered to create new logs

\[ \{\, (L'_f, \rightarrow'_f) \mid f \in \mathit{REC}_A \,\} \]

that are in agreement on the state of object A. Formally, the new logs
generated for the recovering servers will have the following properties:

• Each new log, $(L'_f, \rightarrow'_f)$, is causally consistent.

• All new logs agree on the state of object A. That is,

\[ \forall\, f, g \in \mathit{REC}_A : (L'_f, \rightarrow'_f)|_A = (L'_g, \rightarrow'_g)|_A \]

• The state of object A reflected in the new logs is causally consistent
with the states of all active objects in the system. That is, for any
active object, B,

\[ \forall\, x.A \in (L', \rightarrow')|_A : \forall\, y.B \in R \;(y.B \prec_R x.A) : y.B \in \mathit{AS}_B \]

and

\[ \forall\, y.B \in \mathit{AS}_B : \forall\, x.A \in R \;(x.A \prec_R y.B) : x.A \in (L', \rightarrow')|_A \]

where $(L', \rightarrow')$ is any of the new logs.

• If a server, $f \in \mathit{REC}_A$, is actively managing a replica of some object,
B, then the new log does not interfere with the state logged for that
active object. That is,

\[ \forall\, f \in \mathit{REC}_A : \forall\, B \in D \;(\mathit{ACT}_B \neq \emptyset) : (L'_f, \rightarrow'_f)|_B = (L_f, \rightarrow_f)|_B \]
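
The third ACTIVATE property, causal consistency between the state chosen for an inactive object and the states of active objects, can be sketched as a check over explicitly known dependencies. The function and parameter names are hypothetical, and in the code a pair `(x, y)` means that x precedes y.

```python
def consistent_with_active(new_state, obj_a, active_states, requests, precedes):
    """Check a candidate state for inactive object `obj_a`.

    new_state     -- set of requests proposed as the state of obj_a
    active_states -- dict mapping each active object to its set of requests
    requests      -- dict mapping every request to the object it is made on
    precedes      -- set of (x, y) pairs meaning x precedes y
    """
    for (x, y) in precedes:
        # A request in the new state needs its predecessors on active objects.
        if y in new_state and requests[x] in active_states:
            if x not in active_states[requests[x]]:
                return False
        # A request in an active state needs its predecessors on obj_a.
        if requests[x] == obj_a and requests[y] in active_states:
            if y in active_states[requests[y]] and x not in new_state:
                return False
    return True


requests = {"job1": "Jobs", "job2": "Jobs", "comp1": "Comps"}
precedes = {("job1", "comp1")}
```

With Comps active in state `{"comp1"}`, an empty state for Jobs is rejected and any state containing `job1` is accepted, which matches the requirement that the submission of a completed job be recovered.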

Note  that  all  servers  participating  in  this  synchronization  phase  should 
have  previously  completed  their  JOIN  phases.  The  JOIN  phase  provides 
each  participating  server  with  information  about  the  states  of  active  objects 
in  the  system.  This  information  is  used  in  the  ACTIVATE  phase  in  order  to 
ensure  that  the  state  recovered  for  the  inactive  object  is  causally  consistent 
with  the  states  of  active  objects. 

Examples 

As an example of JOIN and ACTIVATE, consider figure 6. Suppose that
server h is the first server to recover after time $t_2$. No objects will be active
at the time h recovers. The JOIN phase of server h will therefore not need
to take any synchronization actions. Server h will, however, ACTIVATE
object Comps by replaying its log, restoring its replica of Comps to a
state reflecting the completion of $job_1$. Now, suppose server f recovers
next. Again, no synchronization actions will be taken in the JOIN phase
of f because the object that it serves (Jobs) is inactive at the time of
f’s recovery. Server f will therefore proceed to ACTIVATE object Jobs
by replaying its log. Note that the state of Jobs reflected in f’s log is
inconsistent with the active state of object Comps. In order to restore Jobs
to a state consistent with Comps, f will add request $job_1$ to its log before
replaying it. If server g then recovers last, both of the objects it serves will
be active. Its JOIN phase will therefore consist of synchronizing its log
with both of these objects. Server g accomplishes this by deleting request
$job_2$ from its log.

As another example, suppose that server h is not the first server to
recover after time $t_1$. Instead, suppose that server f is the first to recover.
Again, no actions are taken during f's JOIN phase. Server f thus proceeds
to ACTIVATE object Jobs by replaying its log. Note that, unlike before,
object Comps is not active at this time. The state of Jobs recovered by
Recovery Sequence of Server f:

JOIN Phase:

1. for each $A \in D : ACT_A \neq \emptyset \wedge f \in SERV_A$ :

      choose an active server, g, of object A
      synchronize with g on the state of A

2. reconstruct replicas of active objects from $(L_f, \rightarrow_f)$

3. begin processing new requests on active objects

ACTIVATE Phase:

4. while $\exists A \in D : ACT_A = \emptyset \wedge f \in SERV_A$ :

      form a new state for object A by merging the logs of
      all of its recovering servers ($REC_A$)

      - if the new state is inconsistent with the state of any
        active object B ($ACT_B \neq \emptyset$) then abort the
        activation of A until additional servers recover

      $\forall g \in REC_A$ : install the new state in the log,
      $(L_g, \rightarrow_g)$, of server g

      reconstruct replica of object A from $(L_f, \rightarrow_f)$
      begin processing new requests on object A

Figure 7: Recovery Outline
f is not therefore required to be consistent with Comps; server f is free
to restore Jobs to a state that does not reflect the submission of any jobs.
Now, suppose that server g is the next to recover. During its JOIN phase,
server g will synchronize its log with that of server f on the state of Jobs.
To do this, g will delete both job submissions from its log. In addition, g
will also delete the completion notification ($comp_1$) in order to preserve the
causal consistency of its log. Once g has restored its replica of Jobs it will
proceed to ACTIVATE object Comps, restoring its replica of that object
to a state that does not reflect the completion of any jobs. When server
h finally recovers, it will delete $comp_1$ from its log and restore its object
replica to the appropriate state.

5  Log  Transformations 

Our algorithms implementing the JOIN and ACTIVATE phases are based
on functions for adding requests to and deleting requests from servers' logs. The main
difficulty in designing these functions is ensuring that they preserve the
causal consistency of the logs on which they operate.

5.1  Log  Addition 

Consider first the problem of adding a request to a log. Let x.A denote a
request on object A and let f denote a server of object A (i.e. $f \in SERV_A$).

Request x.A is added to the log of server f, $(L_f, \rightarrow_f)$, by inserting it
into the log at some point where the resulting log order remains consistent
with $\prec_R$. The resulting log, however, is not necessarily causally consistent.
There may be causal dependents of request x.A, on objects served by f,
that are not present in the resulting log. In order to preserve the causal
consistency of server f's log, these missing dependents must also be added.

Let $DEP_B(x.A)$ denote the set of requests on object B that are causal
dependents of request x.A. Formally,

$$DEP_B(x.A) = \{ y.B \in R \mid y.B \prec_R x.A \}$$
We denote the function of adding request x.A to the log of server f as
$add_{x.A}(L_f, \rightarrow_f)$. Formally, this function is defined as:

$$add_{x.A}(L_f, \rightarrow_f) = (L, \rightarrow_L)$$

where

$$L = L_f \cup \{x.A\} \cup \Big[ \bigcup_{\{B \mid f \in SERV_B\}} DEP_B(x.A) \Big]$$

and $\rightarrow_L$ is any extension of $\rightarrow_f$ consistent with $\prec_R$.

This definition can easily be extended to accommodate the addition of
multiple requests. We will let $add_Q(L_f, \rightarrow_f)$ denote the addition of all of
the requests in Q to the log of server f. The definition of $add_Q(L_f, \rightarrow_f)$
can be found in [Kan89].
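
The addition rule can be illustrated with a short Python sketch. It is not the implementation from [Kan89]. It assumes that the true dependency relation is available as a hypothetical dictionary `deps` (mapping a request to the set of requests it directly depends on), that a request is a `(name, object)` tuple, and that the old log is already causally consistent. A topological sort is used here to pick one of the many orderings that extend the old log order while respecting the dependencies.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def ancestors(r, deps):
    """All requests that r causally depends on (transitive closure of deps)."""
    seen, stack = set(), [r]
    while stack:
        for d in deps.get(stack.pop(), ()):
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return seen

def add_request(log, x, deps, served):
    """Sketch of add_{x.A}: add x plus its dependents on served objects.

    log    -- list of (name, obj) requests in log order (assumed consistent)
    x      -- the (name, obj) request to add
    deps   -- hypothetical oracle: request -> set of direct dependencies
    served -- set of objects managed by this server
    """
    members = set(log) | {x} | {d for d in ancestors(x, deps) if d[1] in served}
    ts = TopologicalSorter()
    for r in members:
        # every dependency of r that is in the new log must precede r
        ts.add(r, *(ancestors(r, deps) & members))
    for earlier, later in zip(log, log[1:]):
        # keep the old log order, so the result extends it
        ts.add(later, earlier)
    return list(ts.static_order())
```

For example, adding a request on object A whose dependent on object B is missing pulls that dependent into the log ahead of it, provided the server also manages B.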

5.2  Log  Deletion 

The deletion of a request from a server's log is handled in a manner
analogous to the addition of a request. As above, let x.A denote a request on
object A and let f denote a server of object A.

Request x.A is deleted from the log of server f by simply removing it,
preserving the order of the remaining requests. Again, however, the resulting
log may not be causally consistent. There may be requests remaining
in the log that are dependent on the deleted request. In order to preserve
the causal consistency of the log, these requests must also be removed.

Let $CON(x.A \prec y.B)$ denote the relation that request y.B is not causally
dependent on request x.A. That is, the relation $CON(x.A \prec y.B)$ is true
when $x.A \not\prec_R y.B$ holds (the relation is contradicted).

We denote the function of deleting request x.A from the log of server f
as $delete_{x.A}(L_f, \rightarrow_f)$. Formally, this function is defined as:

$$delete_{x.A}(L_f, \rightarrow_f) = (L, \rightarrow_L)$$

where

$$L = \{ y.B \in L_f \mid y.B \neq x.A \wedge CON(x.A \prec y.B) \}$$

$$\forall x, y \in L : (x \rightarrow_L y) \Leftrightarrow (x \rightarrow_f y)$$

As was the case for log addition, log deletion can easily be extended to
accommodate the deletion of multiple requests. We let $delete_Q(L_f, \rightarrow_f)$
denote this function. Its definition can also be found in [Kan89].
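
Deletion can be sketched in the same style. The code below is an illustration only: it again assumes a hypothetical `deps` dictionary standing in for the true dependency relation, and it treats $CON(x.A \prec y.B)$ as "y.B is not reachable from x.A through `deps`".

```python
def depends_on(y, x, deps):
    """True if request y is (transitively) dependent on request x."""
    seen, stack = set(), [y]
    while stack:
        for d in deps.get(stack.pop(), ()):
            if d == x:
                return True
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return False

def delete_request(log, x, deps):
    """Sketch of delete_{x.A}: drop x and every logged request that
    depends on it, keeping the relative order of what remains."""
    return [y for y in log if y != x and not depends_on(y, x, deps)]
```

Because the result is built by filtering the original list, the order of the surviving requests is unchanged, which mirrors the requirement that the new order agree with the old one on the remaining requests.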

6  Synchronization  Solutions 

JOIN  and  ACTIVATE  are  implemented  by  adding  and  deleting  requests 
from  a  server’s  log.  A  recovering  server’s  log  is  altered  until  it  reflects  a 
state  that  is  in  agreement  with  the  states  of  the  other  servers  in  the  system. 
The  log  is  then  used  by  the  recovering  server  to  reconstruct  the  states  of 
its  object  replicas. 

6.1  JOIN  Implementation 

When a server, f, first recovers from a failure, its log, $(L_f, \rightarrow_f)$, is brought
into agreement with the states of active objects. The current states of the
active objects are transferred to the recovering server and written in its log,
replacing any states previously logged for the objects. The actual log of
server f, $(L_f, \rightarrow_f)$, is altered in two ways. First, any request for an active
object that is present in the log, but not present in the object's transferred
state, is removed from the log. These requests represent updates that were
never recovered for the object. Formally, these non-recovered requests are:

$$NR_f = \bigcup_{\{A \mid ACT_A \neq \emptyset\}} \big[ (L_f, \rightarrow_f) \backslash A - AS_A \big]$$

Second, any request present in the transferred state of an object, that is not
present in the log, is added to the log. These requests represent updates
that were missed by the recovering server during its failure. Formally, these
missing requests are:

$$MS_f = \bigcup_{\{A \mid ACT_A \neq \emptyset \wedge f \in SERV_A\}} \big[ AS_A - (L_f, \rightarrow_f) \backslash A \big]$$
The resulting log, the log solving the JOIN problem for server f, is:

$$(L'_f, \rightarrow'_f) = add_{MS_f}(delete_{NR_f}(L_f, \rightarrow_f))$$
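
As a rough illustration of the two request sets used by JOIN, the sketch below computes them as plain set differences. The names `join_sets` and `active_states` are hypothetical, requests are `(name, object)` tuples, and `active_states` is assumed to hold only the transferred states of active objects that the recovering server manages.

```python
def join_sets(log, active_states):
    """Return (NR_f, MS_f) for a recovering server.

    log           -- list of (name, obj) requests in the server's log
    active_states -- dict: active object -> set of requests in its
                     transferred state (AS_A)
    """
    logged = set(log)
    non_recovered, missing = set(), set()
    for obj, state in active_states.items():
        on_obj = {r for r in logged if r[1] == obj}
        non_recovered |= on_obj - state   # logged but never recovered
        missing |= state - on_obj         # recovered but missed locally
    return non_recovered, missing
```

The log contents used to exercise this sketch would have to be assumed, since they depend on the figure; with a log holding two job submissions and one completion, an empty active state for Jobs marks both submissions as non-recovered.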


6.2  ACTIVATE  Implementation 

Once the log of a recovering server is brought into agreement with the
states of active objects, it is then brought into agreement with the logs of
other recovering servers on the states of inactive objects. Let A denote an
inactive object in the system (i.e. $ACT_A = \emptyset$), and let A be such that all
recovering servers (i.e. all members of $REC_A$) have completed their JOIN
phases.

In order to restore object A, the recovering servers of A first agree on
a state for it. Ideally, this state is the most complete state constructible
from their logs, the state formed by combining all of their logged requests:

$$IS_A = \bigcup_{f \in REC_A} (L_f, \rightarrow_f) \backslash A$$


This ideal state can, however, be inconsistent with the states of some
active objects. There may be requests in the ideal state that have
dependencies on requests for other objects that were not recovered for those objects.
In order to preserve the consistency between objects, these inconsistent
requests must be omitted from the recovered state of object A.

We let $SAFE(x.A)$ denote the predicate that request x.A is consistent
with the states of all active objects. That is, request x.A does not have any
dependencies on requests not recovered for those objects. Formally,

$$SAFE(x.A) = \bigwedge_{\{B \mid ACT_B \neq \emptyset\}} \big( DEP_B(x.A) \subseteq AS_B \big)$$

The state recovered for object A, the most complete and consistent state
constructible from the servers' logs, is therefore:

$$NS_A = \{ x.A \in IS_A \mid SAFE(x.A) \}$$
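
The construction of the recovered state can be sketched as a union followed by a filter. This is a minimal illustration under stated assumptions: `dep` is a hypothetical oracle returning the set of object B requests that a request depends on, `logs` holds only the logs of the recovering servers of the object, and requests are `(name, object)` tuples.

```python
def new_state(obj, logs, active_states, dep):
    """Sketch of NS_A: the safe subset of the ideal state IS_A.

    obj           -- the inactive object being restored
    logs          -- dict: recovering server -> list of (name, obj) requests
    active_states -- dict: active object -> recovered request set (AS_B)
    dep           -- hypothetical oracle: dep(x, B) -> set of B requests
                     that x depends on
    """
    ideal = {r for log in logs.values() for r in log if r[1] == obj}

    def safe(x):
        # x is safe if all of its dependents on every active object
        # were recovered as part of that object's state
        return all(dep(x, b) <= state for b, state in active_states.items())

    return {x for x in ideal if safe(x)}
```

A request whose dependent was not recovered for an active object is dropped from the result, while requests with no such dependency survive.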
This state is installed into the logs of the recovering object A servers in
the same manner that the transferred states were installed into their logs
during their JOIN phases. For each recovering server, f, the new state is
installed in two parts. First, any object A request present in the log of
server f, that is not present in the new state, is removed from the log.
These removed requests are the inconsistent requests that were omitted
from the ideal state. Formally, these requests are:

$$NR_f = (L_f, \rightarrow_f) \backslash A - NS_A$$

Second, any request present in the new state, that is not present in the log,
is added to the log. Formally, these missing requests are:

$$MS_f = NS_A - (L_f, \rightarrow_f) \backslash A$$

The resulting log, the log solving the ACTIVATE problem for object A
at server f, is:

$$(L'_f, \rightarrow'_f) = add_{MS_f}(delete_{NR_f}(L_f, \rightarrow_f))$$

Actually, the new state recovered for object A may not be totally
consistent with the states of active objects. It is possible that an active object
may have a dependency on a request that is not recovered for object A.
This can happen, for example, if the dependent request was never logged,
or if the servers that did log it did not recover in time to take part in
the ACTIVATE phase. When this problem of a missing dependent occurs,
the ACTIVATE phase must abort the restoration of object A. It must then
wait for additional servers of object A to recover (hopefully with the
missing dependent present in one of their logs) before re-attempting to activate
the object.

7  Dependency  Estimation 

The implementations of the JOIN and ACTIVATE phases assume that
servers have knowledge of $\prec_R$. In particular, the implementations are based
on the values of $DEP_B(x.A)$, $CON(x.A \prec y.B)$, and $SAFE(x.A)$, which
depend on $\prec_R$. Servers, however, will not often have access to this dependency
information. Thus, servers will not be able to use the implementations as
they have been presented. Instead, servers will have to estimate dependency
information and use those estimates to coordinate their logs.

7.1  Dependency  Types 

Request dependencies can be estimated from the orderings of requests in
servers' logs. There are two types of dependencies: transitive and direct. A
transitive dependency is a dependency formed from the composite of other
dependencies. That is, a dependency, $x \prec_R y$, is transitive if it is due to a
sequence of direct dependencies:

$$x \prec_R z_1 \prec_R z_2 \prec_R \ldots \prec_R z_n \prec_R y$$

A dependency is direct if it is not the composite of other dependencies.
Direct dependencies are the basic dependencies in a system. Formally,

Definition 7.1 A dependency, $x.A \prec_R y.B$, is direct if

$$\neg \exists z \in R : (x \prec_R z \wedge z \prec_R y)$$

Definition 7.2 A dependency, $x \prec_R y$, is transitive if it is not direct.

7.2  Object  Dependency  Relation 

As stated above, servers will not often have knowledge of the causal
dependency relation. We assume, though, that servers have knowledge of
a generalization of this relation. This generalization is called the object
dependency relation.

Definition 7.3 The object dependency relation, $A \rightsquigarrow_R B$, holds between
two objects, $A, B \in D$, if it is possible that an object B request is causally
dependent on an object A request.
The object dependency relation only tells a server about potential
dependency. If the relation $A \rightsquigarrow_R B$ holds, then it is possible that some object B
requests are dependent on some object A requests. However, a server does
not know which requests, if any, are related. If two objects are unrelated,
$A \not\rightsquigarrow_R B$, then a server does know that no object B request is causally
dependent on any object A request. Formally, this can be summarized as
follows:

$$\forall x.A, y.B \in R : x.A \prec_R y.B \Rightarrow A \rightsquigarrow_R B$$

7.3  Basic  Estimates 

We begin this section by presenting our basic estimates of the causal
dependency relation. These estimates are designed to approximate direct
relationships between requests. The basic estimates are used later in this
section to build more complex estimates for approximating transitive
relationships.

In order to help ensure that each direct dependency is represented in
the order of some server's log, we assume that any pair of objects, between
which direct dependencies may hold, have overlapping server sets. Formally,
we assume that

$$\forall \text{ direct } x.A \prec_R y.B : (SERV_A \cap SERV_B \neq \emptyset)$$

7.3.1  Request  Ordering 

Our basic estimate of the relation $CON(x.A \prec y.B)$ is denoted by the
relation $con^0(x.A \prec y.B)$. It is designed to approximate whether or not two
requests, x.A and y.B, are directly related. In particular, when the estimate
$con^0(x.A \prec y.B)$ is true, it is guaranteed that request y.B is not causally
dependent on request x.A. When the estimate is false, though, there is no
guarantee as to whether or not the requests are related.

The idea behind the estimate is to examine servers' logs for evidence
that the requests are unrelated. Recall that, according to the causality
constraint on logs, if a server logs one request then it must previously have
logged all of that request's dependents that are on objects managed by
the server. It therefore follows that if some server has logged request y.B
before request x.A, then y.B cannot be causally dependent on request x.A.
Similarly, if some server of both objects A and B has logged request y.B
but not request x.A, then request y.B cannot be dependent on request x.A.
Formally,

Definition 7.4 The relation $con^0(x.A \prec y.B)$ holds between any two
requests, $x.A, y.B \in R$, if and only if any of the following conditions is true:

1. 

2. $\exists f \in SERV_A \cap SERV_B : x.A, y.B \in L_f \wedge y.B \rightarrow_f x.A$

3. $\exists f \in SERV_A \cap SERV_B : y.B \in L_f \wedge x.A \notin L_f$


7.3.2  Dependency  Set 

Our basic estimate of $DEP_B(x.A)$ is denoted $dep^0_B(x.A)$. It is designed to
approximate the set of direct object B dependents of request x.A. Our
estimate has the property that, when defined, it is either equal to the true
dependency set or an overestimate of it.

Like the previous estimate, this estimate is based on the causality
constraint for logs. As pointed out earlier, if a server logs one request, then it
must previously have logged all of that request's dependents that are on
objects it serves. Thus, if a server of both objects A and B has logged request
x.A, then that server's log must contain all of the object B dependents of
x.A in positions preceding x.A in the log. The set of object B requests
preceding x.A can therefore be used as an estimate of the true dependents.

Some of these object B requests may not, however, be real dependents
of x.A. They may just be requests that happened to get logged before x.A.
Some of these extraneous requests can be detected and eliminated from the
approximation by using the first basic estimate. In particular, the set of
object B dependents of request x.A can be estimated as follows:

$$dep^0_B(x.A) = \begin{cases} \bot & \text{if } \neg \exists f \in SERV_A \cap SERV_B : x.A \in L_f \\ \emptyset & \text{if [condition illegible in the source]} \\ \{ y.B \mid \exists f \in SERV_A \cap SERV_B : (x.A, y.B \in L_f \wedge y.B \rightarrow_f x.A \wedge \neg con^0(y.B \prec x.A)) \} & \text{o.w.} \end{cases}$$


Note  that  the  estimate  is  undefined  when  no  server  of  both  objects  A  and 
B  has  logged  request  x.A.  Under  this  condition,  the  estimation  method 
presented  here  cannot  be  used. 
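
The two basic estimates can be sketched together in Python. This is an illustration under several assumptions, not the paper's code: requests are `(name, object)` tuples, `logs` maps each currently available server to its log (a list in log order), `servers` maps each object to its server set, and `None` stands for an undefined estimate. Only the two log-based conditions described in the text (a reversed log order, or a log holding y.B without x.A) are implemented.

```python
def con0(x, y, logs, servers):
    """Basic ordering estimate: True only when the available logs show
    that request y cannot be causally dependent on request x."""
    for f in servers[x[1]] & servers[y[1]]:
        log = logs.get(f)
        if log is None:          # server down, log not available
            continue
        if y in log and x not in log:
            return True          # y was logged without x
        if x in log and y in log and log.index(y) < log.index(x):
            return True          # y was logged before x
    return False

def dep0(x, obj_b, logs, servers):
    """Basic dependency-set estimate: object-B requests logged before x,
    minus those that con0 rules out. Returns None when undefined."""
    shared = [logs[f] for f in servers[x[1]] & servers[obj_b]
              if f in logs and x in logs[f]]
    if not shared:
        return None              # no server of both objects logged x
    result = set()
    for log in shared:
        for y in log[:log.index(x)]:
            # con0(y, x, ...) being true means x cannot depend on y
            if y[1] == obj_b and not con0(y, x, logs, servers):
                result.add(y)
    return result
```

With a single shared log the estimate may include extraneous requests that merely happened to be logged earlier; a second shared log that lacks such a request lets `con0` eliminate it, which is how the estimate tightens as more ordering information becomes available.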


7.4  Compound  Estimates 

Transitive dependencies are estimated by approximating the sequences of
direct dependencies out of which they are built. In presenting these
estimates, the following definition will be useful:

Definition 7.5 A chain, H, is a sequence of related objects,

$$H = A_1 \rightsquigarrow_R A_2 \rightsquigarrow_R \ldots \rightsquigarrow_R A_n$$

Intuitively, a chain represents a sequence of objects along which a transitive
dependency may occur. If a chain such as H exists, then it is possible for
an object $A_n$ request to be dependent on an object $A_1$ request through
a sequence of dependencies on requests on objects $A_{n-1}, A_{n-2}, \ldots, A_2$.
However, there is no guarantee that such a transitive dependency exists.
The existence of a chain only implies the potential for such a dependency.
We let AB-CHAINS denote the set of all chains from object A to object B.

The  following  definitions,  based  on  chains,  will  also  be  useful: 
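Definition 7.6 A sub-chain of a chain, H,

$$H = A_1 \rightsquigarrow_R A_2 \rightsquigarrow_R \ldots \rightsquigarrow_R A_n$$

is any subsequence of its objects

$$A_{m_1} \rightsquigarrow_R A_{m_2} \rightsquigarrow_R \ldots \rightsquigarrow_R A_{m_s}$$

where $1 \le m_1 < m_2 < \ldots < m_s \le n$.

Definition 7.7 The $A_iA_j$ sub-chain of a chain, H,

$$H = A_1 \rightsquigarrow_R A_2 \rightsquigarrow_R \ldots \rightsquigarrow_R A_n$$

is the sub-chain of objects from $A_i$ to $A_j$:

$$H_{i..j} = A_i \rightsquigarrow_R A_{i+1} \rightsquigarrow_R \ldots \rightsquigarrow_R A_j$$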

7.4.1  Dependency  Set 

We denote our compound estimate of $DEP_B(x.A)$, the object B dependents
of request x.A, by the request set $dep^+_B(x.A)$. This estimate, like the basic
estimate, has the property that when defined it contains all of the object
B dependents of request x.A, plus possibly a few extraneous requests.

This estimate is built out of estimates of dependencies along individual
chains. In order to estimate the object B dependents of request x.A, the
dependents along each chain from B to A are separately estimated. These
estimates are then combined to form a complete estimate of the dependency
set. Specifically, let H denote any chain,

$$H = A_1 \rightsquigarrow_R A_2 \rightsquigarrow_R \ldots \rightsquigarrow_R A_n$$

We let $dep^+_H(x_n.A_n)$ denote our estimate of the object $A_1$ dependents of
request $x_n.A_n$ that occur along chain H. That is, $dep^+_H(x_n.A_n)$ estimates
the set of object $A_1$ requests that are related to $x_n.A_n$ by a sequence of
dependencies on the objects in chain H.
The estimate $dep^+_H(x_n.A_n)$ can be formed in many ways. First, if there is
a server that manages replicas of both objects $A_1$ and $A_n$, then an estimate
can be obtained by simply applying the basic estimate $dep^0_{A_1}(x_n.A_n)$. In
general, however, the server sets of objects $A_1$ and $A_n$ will not overlap
unless the objects are directly related.

Alternately, an estimate can be formed by subdividing the problem.
That is, an estimate can be formed by first choosing some object in the
chain, $A_i$ ($1 < i < n$), estimating the object $A_i$ dependents of $x_n.A_n$, and
then estimating the object $A_1$ dependents of the object $A_i$ dependents.
Again, if the server sets of objects $A_1$ and $A_i$ overlap, and the server sets of
objects $A_i$ and $A_n$ overlap, the basic estimates can be applied to solve each
of these sub-problems. That is, if the server sets overlap, an estimate can be
formed directly along the sub-chain

$$A_1 \rightsquigarrow_R A_i \rightsquigarrow_R A_n$$

However, this is not likely to be the case unless the pairs of objects are
directly related. If the server sets do not overlap, then each of the
sub-problems will have to be further subdivided in a manner similar to the
original problem. In general, the problem will have to be sub-divided until
a sub-chain of H is found

$$A_1 \rightsquigarrow_R A_{m_1} \rightsquigarrow_R A_{m_2} \rightsquigarrow_R \ldots \rightsquigarrow_R A_{m_p} \rightsquigarrow_R A_n$$

$$1 < m_1 < m_2 < \ldots < m_p < n$$

in which each pair of adjacent objects have overlapping server sets. An
estimate can then be formed along this sub-chain by first approximating
the object $A_{m_p}$ dependents of $x_n.A_n$, then approximating the object $A_{m_{p-1}}$
dependents of the object $A_{m_p}$ dependents, and similarly approximating the
dependents for each successive object down the sub-chain.

This process can be specified recursively in the following manner. Note
that the estimate is extended to operate on sets of requests. That is,
$dep^+_H(Q)$ denotes the set of object $A_1$ dependents of the object $A_n$ requests
in Q:*

$$dep^+_H(Q) = \begin{cases} \bigcup_{x_2.A_2 \in Q} dep^0_{A_1}(x_2.A_2) & \|H\| = 2 \\ \bigcup_{x_n.A_n \in Q} dep^+_{H_{1..i}}(dep^+_{H_{i..n}}(x_n.A_n)) & \|H\| > 2 \end{cases}$$

where $1 < i < n$ is chosen so that the estimates are defined.

Note that there may be several choices of i for which the estimates are
defined. Each will likely yield a slightly different approximation of the true
dependency set. However, each is guaranteed to contain all of the true
dependencies that occur along H. Because of this, an even more accurate
estimate of the true dependency set (one with fewer extraneous requests)
can be formed by intersecting the estimates from each choice of i. The
estimate can thus be modified as follows:

$$dep^+_H(Q) = \begin{cases} \bigcup_{x_2.A_2 \in Q} dep^0_{A_1}(x_2.A_2) & \|H\| = 2 \\ \bigcup_{x_n.A_n \in Q} \Big[ \bigcap_{1 < i < n} dep^+_{H_{1..i}}(dep^+_{H_{i..n}}(x_n.A_n)) \Big] & \|H\| > 2 \end{cases}$$

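Note that the definitions of union and intersection must be extended to
take into account the possibility of undefined approximations. This is done
as follows:

$$\bigcup_i S_i = \begin{cases} \bot & \text{if } \exists i : S_i = \bot \\ \bigcup_i S_i & \text{o.w.} \end{cases}$$

$$\bigcap_i S_i = \begin{cases} \bot & \text{if } \forall i : S_i = \bot \\ \bigcap_{\{i \mid S_i \neq \bot\}} S_i & \text{o.w.} \end{cases}$$

*The length of a chain, H, is denoted by $\|H\|$. This is the number of objects in the
chain.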
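
The recursive estimate along a chain can be sketched as follows. This is one possible reading of the definition, not the authors' code. It assumes a hypothetical callable `basic(x, obj)` that plays the role of the basic dependency-set estimate and returns `None` when undefined, and it follows the convention that a union is undefined when any operand is undefined while an intersection is undefined only when all operands are.

```python
def ext_union(sets):
    """Union that is undefined (None) if any operand is undefined."""
    sets = list(sets)
    if any(s is None for s in sets):
        return None
    return set().union(*sets)

def ext_intersection(sets):
    """Intersection over the defined operands; None if none is defined."""
    defined = [s for s in sets if s is not None]
    if not defined:
        return None
    return set(defined[0]).intersection(*defined[1:])

def dep_chain(chain, q, basic):
    """Estimate the chain[0] dependents of the requests in q along chain.

    chain -- list of objects [A_1, ..., A_n]
    q     -- set of requests on A_n
    basic -- hypothetical basic estimate: basic(x, obj) -> set or None
    """
    def one(x):
        if len(chain) == 2:
            return basic(x, chain[0])
        candidates = []
        for i in range(1, len(chain) - 1):       # every split point A_i
            upper = dep_chain(chain[i:], {x}, basic)
            candidates.append(
                None if upper is None
                else dep_chain(chain[:i + 1], upper, basic))
        return ext_intersection(candidates)
    return ext_union(one(x) for x in q)
```

Every interior object is tried as a split point, so the amount of work grows quickly with the length of the chain; a practical version would likely cache intermediate results.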
The estimate of the complete set of object B dependents of request x.A
is formed by combining the estimates of dependency along each chain from
B to A. Formally,

$$dep^+_B(x.A) = \bigcup_{H \in BA\text{-}CHAINS} dep^+_H(x.A)$$


7.4.2  Request  Ordering 

We denote our compound estimate of the predicate $CON(x.A \prec y.B)$ by the
predicate $con^+(x.A \prec y.B)$. Like the basic estimate, this predicate has the
property that, when true, it is guaranteed that request y.B is not causally
dependent on request x.A. And, when false, the requests may or may not
be related.

As with the dependency set estimate, the request ordering estimate is
built up from estimates of ordering along individual chains. For any
$A_1A_n$-chain, H, we let $con^+_H(x_1.A_1 \prec x_n.A_n)$ denote our estimate of whether or not
request $x_n.A_n$ is causally dependent on request $x_1.A_1$ along chain H. When
the estimate is true, it is guaranteed that request $x_n.A_n$ is not dependent
on request $x_1.A_1$ by a sequence of dependencies along the chain.

The idea behind this estimate is to search the chain for an object, $A_i$, at
which any possible dependency path from $x_1$ to $x_n$ is broken. That is, the
chain is searched for an object, $A_i$, such that none of the $A_i$ dependents of
$x_n$ are dependent on $x_1$. The existence of such an object would imply that
request $x_n$ is not dependent on request $x_1$ by a sequence of dependencies
that include object $A_i$. Because $A_i$ is a part of chain H, this would in turn
imply that the requests cannot be dependent along chain H.

The estimate is formed by examining each object, $A_i$, in the chain. For
each such object, the dependents of request $x_n.A_n$ are estimated. These
dependent requests are then recursively tested to determine if any of them
are dependent on $x_1.A_1$. The formal definition of this function is given
below. Note that the estimate is extended to operate on sets of requests;
that is, $con^+_H(x_1.A_1 \prec Q)$ denotes our approximation of whether or not any
of the object $A_n$ requests in Q are causally dependent, along chain H, on
request $x_1.A_1$:

$$con^+_H(x_1.A_1 \prec Q) = \begin{cases} \bigwedge_{x_2.A_2 \in Q} con^0(x_1 \prec x_2) & \|H\| = 2 \\ \bigwedge_{x_n.A_n \in Q} \Big[ con^0(x_1 \prec x_n) \vee \Big( \bigvee_{1 < i < n} con^+_{H_{1..i}}(x_1 \prec dep^+_{H_{i..n}}(x_n)) \Big) \Big] & \text{o.w.} \end{cases}$$


In order to estimate whether or not two requests are related in general,
the estimates of their dependency along individual chains are combined.
Formally,

$$con^+(x.A \prec y.B) = \bigwedge_{H \in AB\text{-}CHAINS} con^+_H(x.A \prec y.B)$$


7.4.3  Safety 

Our last compound estimate is denoted $safe^+(x.A)$. It is designed to
approximate the predicate $SAFE(x.A)$. This estimate has the property that,
when true, it is guaranteed that request x.A is consistent with the states
of all active objects in the system. That is, when the estimate is true, it is
guaranteed that request x.A does not have a dependency on a request for
an active object that was not recovered as part of that object's state. If
the estimate is false, though, the request may or may not be safe.

Like the other compound estimates, the safety predicate is built from
estimates of safety along individual chains. For any $A_1A_n$-chain, H, we
let $safe^+_H(x_n.A_n)$ denote our estimate of the safety of request $x_n.A_n$ along
chain H. When true, this estimate guarantees that request $x_n.A_n$ has no
dependencies, along chain H, on non-recovered requests for object $A_1$.

Before defining this estimate, though, some motivation is first presented.
Suppose a request $x_n.A_n$ is not safe along some chain, H. That is, request
$x_n.A_n$ is dependent on some non-recovered request, $x_1.A_1 \notin AS_{A_1}$, by a
sequence of dependencies along chain H:

$$x_1.A_1 \prec_R x_2.A_2 \prec_R \ldots \prec_R x_n.A_n$$
Because each of the requests in this sequence is dependent on $x_1.A_1$, it
follows that each request $x_i.A_i$ is also unsafe, along sub-chain $H_{1..i}$. Because
unsafe requests are never recovered for an object, it further follows that
none of the unsafe requests in the above dependency sequence can be part
of their object's active states. Thus, if a request $x_n.A_n$ is unsafe along chain
H, then that request has a non-recovered dependent from each active object
in the chain. Conversely, if there is an active object in the chain, whose
active state contains all of the object $A_i$ dependents of $x_n.A_n$, then $x_n.A_n$
must be safe along chain H.

The safety estimate is based on looking for objects such as $A_i$. In
particular, the safety of request $x_n.A_n$ along chain H is estimated by examining
each active object, $A_i$, in the chain. For each such object, the dependents
of request $x_n.A_n$ are estimated and tested to determine if they are present
in the object's active state. If all such dependents are present, then request
$x_n.A_n$ is safe along chain H.

$$safe^+_H(x_n.A_n) = \exists i : (ACT_{A_i} \neq \emptyset \wedge dep^+_{H_{i..n}}(x_n.A_n) \subseteq AS_{A_i})$$
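
The per-chain safety test can be sketched as a search for a "breaking" active object. The sketch assumes a hypothetical callable `dep_along(subchain, x)` that estimates the dependents of x on the first object of `subchain` (returning `None` when undefined), and it conservatively treats an undefined estimate as giving no evidence of safety. Restricting the search to objects before the last one in the chain is also an assumption of this sketch.

```python
def safe_along(chain, x, active_states, dep_along):
    """Estimate whether request x (on chain[-1]) is safe along chain.

    chain         -- list of objects [A_1, ..., A_n]
    active_states -- dict: active object -> recovered request set;
                     inactive objects are simply absent
    dep_along     -- hypothetical estimate: dep_along(subchain, x)
                     -> set of requests on subchain[0], or None
    """
    for i in range(len(chain) - 1):
        state = active_states.get(chain[i])
        if state is None:
            continue                      # A_i is not active
        estimate = dep_along(chain[i:], x)
        if estimate is not None and estimate <= state:
            return True                   # all A_i dependents were recovered
    return False
```

A single active object whose state covers all estimated dependents is enough to declare the request safe along the chain, even if an earlier object in the chain is missing requests.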

The general estimate of the safety of request x.A is built by combining
the estimates of the request's safety along individual chains. Specifically,
a request is safe if it is safe along all chains of dependency from active
objects. Formally,

$$safe^+(x.A) = \bigwedge_{\{B \in D \mid ACT_B \neq \emptyset\}} \; \bigwedge_{H \in BA\text{-}CHAINS} safe^+_H(x.A)$$

7.5  Using  the  Estimates 

The compound estimates can be used directly in the log synchronization
algorithms in place of the values they approximate. The proof that the
algorithms remain correct is given in [Kan89].

A problem can arise, though, when the estimates do not have access to
all of the logs in the system. At the time an estimate is formed, some servers
may not be functioning. Because of this, the estimates may have access to
limited ordering information, based on which server logs are available. This
can lead to undefined estimates, producing aborts of the synchronization
algorithms. Unfortunately, there is no way around this problem. When an
abort occurs, the server or servers involved must simply wait until
additional failed servers have recovered (providing additional ordering
information) and then re-execute their synchronization algorithms.

Limited information does not always lead to undefined estimates,
however. In order to approximate the dependencies along a particular
AB-chain, H, an estimate considers all AB-subchains of H. As long as an
approximation can be formed along some sub-chain of H, the estimate will
be defined.

Another problem can arise when a synchronization algorithm adds a request
to, or deletes a request from, a server's log. Because the algorithms use only
estimates of the dependencies in the system, it is possible that a
synchronization algorithm may believe that a request needs to be added to or deleted
from the state of an active object. When this occurs, it is a sign that the
estimates do not have access to sufficient ordering information to deduce
accurate approximations. In these cases, the invoking server should abort
the synchronization algorithm and wait for additional servers to recover
(with additional ordering information for the estimates).

8  Special  Systems 

When long chains exist in a system, the compound estimates of the previous
section can be fairly expensive to compute. In order to form an estimate
along a particular chain, H, the compound estimates recursively sub-divide
the chain and combine simple estimates from each of the sub-divisions.
Unfortunately, the number of sub-divisions of a chain grows exponentially
with the length of the chain. Thus, when a chain is long, the estimates must
consider a large number of sub-divisions and are expensive to compute, which
in turn makes the synchronization algorithms expensive to run.
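
The exponential growth can be made concrete with a small sketch. It assumes, purely for illustration, that a sub-division of a chain is any way of cutting it at interior objects into consecutive sub-chains that share their endpoints; the function name `subdivisions` is hypothetical. Under that assumption a chain of n objects has 2^(n-2) sub-divisions, since each of the n-2 interior objects either is or is not a cut point.

```python
from itertools import combinations


def subdivisions(chain):
    """Yield every way to cut `chain` (a list of at least two objects)
    into consecutive sub-chains that share their endpoints."""
    last = len(chain) - 1
    interior = range(1, last)  # positions where a cut is allowed
    for k in range(len(chain) - 1):  # number of cuts: 0 .. n-2
        for cuts in combinations(interior, k):
            bounds = [0, *cuts, last]
            yield [chain[a:b + 1] for a, b in zip(bounds, bounds[1:])]


# The count doubles with each object added to the chain.
for n in range(2, 8):
    chain = [f"O{i}" for i in range(n)]
    print(n, sum(1 for _ in subdivisions(chain)))
```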

In order to reduce this cost, the basic estimates can sometimes be used in
place of the compound estimates in the synchronization algorithms. Unlike
the compound estimates, the basic estimates do not involve recursion and
are, in general, fairly cheap to compute. Unfortunately, the basic estimates
can only be used to approximate relationships between two objects, A and
B, if the server sets of those objects overlap. Thus, in order to replace the
compound estimates with the basic estimates, it must be the case that every
pair of related objects has overlapping server sets, regardless of whether
the objects are directly or transitively related.
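
A minimal sketch of this overlap condition is given below. The representation is an assumption made for illustration: `depends_on` maps each object to the objects it directly depends on, and `server_sets` maps each object to the set of servers holding its replicas. Neither name comes from the paper.

```python
def transitive_dependencies(depends_on):
    """Map each object to every object it depends on, directly or transitively."""
    def visit(obj, seen):
        for parent in depends_on.get(obj, ()):
            if parent not in seen:
                seen.add(parent)
                visit(parent, seen)
        return seen

    return {obj: visit(obj, set()) for obj in depends_on}


def basic_estimates_applicable(depends_on, server_sets):
    """True if every pair of related objects shares at least one server."""
    for obj, ancestors in transitive_dependencies(depends_on).items():
        for anc in ancestors:
            if not (server_sets[obj] & server_sets[anc]):
                return False
    return True


# C depends on B, and B on A. Each direct pair overlaps, but C and A do not.
depends_on = {"B": {"A"}, "C": {"B"}}
server_sets = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}}
print(basic_estimates_applicable(depends_on, server_sets))
```

The example shows why the transitive case matters: overlap between directly related objects does not by itself give overlap between transitively related ones.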

Of course, in general the server sets of all related objects will not overlap,
and so the basic estimates cannot be used. Even if the server sets
do overlap, however, the basic estimates are not guaranteed to be defined
at all times. As pointed out in the previous section, server failures can
cause estimates to be undefined. The exact estimates that are defined at
any given time depend on the servers that are functioning; that is, they
depend on which server logs are available to the estimates.

There  is,  however,  an  interesting  class  of  systems  in  which  the  basic 
estimates  are  always  defined.  We  call  this  class  the  backward  inclusion 
systems: 

Definition  8.1  A  system  is  a  backward  inclusion  system  if  the  following 
restriction  holds  on  the  server  sets  of  objects: 

$\forall A, B \in R:\quad A \rightarrow_r B \;\Rightarrow\; SERV_B \subseteq SERV_A$

Intuitively, a system is a backward inclusion system if any server that
manages a replica of an object, B, also manages replicas of all objects on which
B depends. This restriction implies that if a server logs a request x.B, then
it also logs all dependents of x.B.
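
The restriction of Definition 8.1 can be checked mechanically. The sketch below assumes a hypothetical representation in which `depends_on` maps each object to the objects it directly depends on and `server_sets` maps each object to its set of servers. Because set inclusion is transitive, checking the direct dependencies is enough to cover the transitive ones as well.

```python
def is_backward_inclusion(depends_on, server_sets):
    """True if every object's server set is contained in the server set
    of each object on which it depends."""
    return all(
        server_sets[b] <= server_sets[a]
        for b, parents in depends_on.items()
        for a in parents
    )


# A small, made-up hierarchy: B and C depend on A, and D depends on B.
depends_on = {"B": {"A"}, "C": {"A"}, "D": {"B"}}
server_sets = {"A": {1, 2, 3, 4}, "B": {1, 2}, "C": {3, 4}, "D": {1}}
print(is_backward_inclusion(depends_on, server_sets))

# Placing a replica of D on a server that does not manage B breaks the property.
server_sets["D"] = {1, 5}
print(is_backward_inclusion(depends_on, server_sets))
```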

The set of backward inclusion systems includes hierarchically organized
systems such as the one depicted in figure 8. In this figure, each object’s
server set is completely contained in the server sets of all objects above it in
the hierarchy. Figure 8(a) shows the tree structured dependency relationship
between the six objects in the system. Figure 8(b) shows the overlap
between the server sets of the different objects.

Figure 8: A hierarchical system

The proof that the synchronization algorithms never abort in backward
inclusion systems is given in [Kan89]. The proof is based on the fact that
each server logs complete dependency information on all of the requests in
its log, and so there is always complete dependency information available
on any request that is added to or deleted from a server’s log. It should be
pointed out that the log addition and deletion functions must be slightly
modified when the basic estimates are used. The reason for this and the
appropriate modifications are given in [Kan89].

9  Conclusions 

This paper presented a new mechanism for performing optimistic log-based
recovery in distributed systems. Unlike existing methods, the mechanism
presented does not require the maintenance of explicit dependency information.
Instead, by requiring that the server sets of related objects overlap,
the mechanism is able to estimate any needed dependency information from
the ordering of requests in servers’ logs.

In addition, the mechanism avoids the use of process rollback as a
synchronization technique. When a server first recovers from failure, its state
(the state represented in its log) is brought into agreement with the state
of the system. A server is never allowed to recover in an inconsistent state.
However, in order to ensure this, a recovering server may have to be blocked
until sufficient information is available in the system to deduce that the
server’s state is consistent. Because of this potential for blocking, a
restricted set of systems (the backward inclusion systems) was presented in
which blocking never occurs and in which inexpensive dependency estimates
can be used.

It should be pointed out that our mechanism makes no guarantees about
the consistency between the states of clients and servers when the client and
server sets differ. Because of failures, a server may lose a client request.
When this happens, our mechanism only ensures that the states of different
servers will be brought into agreement. It makes no attempt to coordinate
the states of both clients and servers. In some applications, for example
the sample printer service described in this paper, the loss of client requests
is not critical. In many other applications, however, consistency between
clients and servers is crucial. In applications such as these, our mechanism
requires that the sets of clients and servers be identical.

Our mechanism can be extended to enforce forms of consistency other
than causal consistency. As described in [Kan89], the basic approach of
estimating and ensuring dependencies can be used to ensure an atomic
form of consistency. In the atomic form, a set of requests can be grouped
to form a set with the property that no request in the group is recovered
after a failure unless all of the requests in the group are also recovered.
By combining this atomic form of consistency with the causal form, it may
even be possible to derive a serializable form of consistency implementable
by our basic mechanism.
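
The all-or-nothing property of the atomic form can be illustrated with a short sketch. It is not the construction of [Kan89]; it only filters a set of recovered requests so that no group is partially recovered. The function name `enforce_atomic_groups` and the representation of requests as strings are assumptions made for the example.

```python
def enforce_atomic_groups(recovered, groups):
    """Return the largest subset of `recovered` in which every group is
    either entirely present or entirely absent."""
    kept = set(recovered)
    changed = True
    while changed:  # repeat, since groups may share requests
        changed = False
        for group in groups:
            if kept & group and not group <= kept:
                kept -= group  # drop a partially recovered group
                changed = True
    return kept


# y2 was lost in the failure, so y1 must not be recovered either.
groups = [{"x1", "x2"}, {"y1", "y2"}]
print(enforce_atomic_groups({"x1", "x2", "y1"}, groups))
```

The loop runs to a fixed point because dropping one group can leave an overlapping group incomplete. Causal dependents of a dropped request would also have to be removed, which this sketch does not attempt.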

One problem that remains to be addressed is that of restricting the
sizes of logs. As we have presented them, logs can grow without bound.
Clearly, in any implementation of the mechanism, the growth of logs must
be limited through the use of checkpoints. The main difficulty involved in
maintaining checkpoints is estimating the dependencies that exist between
different checkpoints and between requests and checkpoints. A detailed
discussion of this problem and its solution is provided in [Kan89].

Although the synchronization algorithms presented in this paper have
not yet been implemented, we believe that doing so should be fairly
straightforward. For example, in the case where the basic estimates are used, object
synchronization amounts to little more than sorting. One of the problems
with building an implementation, though, is finding applications on which
to test and measure its performance. Currently, applications with a high
degree of object inter-dependence are rare. Because of the increasing use of
object oriented interfaces, however, we believe that such applications will
become increasingly common.